Data visualization is the presentation of data in a pictorial or graphical format.
The most important but dangerous element of data analytics.
There are a few basic concepts that can help you generate the best visuals for displaying your data:
Understand your data.
Determine what you want to communicate.
Know your audience.
Only 3 ingredients are required to make a plot.
Always begin with the main function in ggplot2: ggplot()
Data are specified via the “data” argument:
With the data supplied, ggplot() creates a coordinate system to add layers to.
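A minimal sketch of this first ingredient. The slides do not name a data set, so the mpg data set that ships with ggplot2 is assumed here purely for illustration:

```r
library(ggplot2)

# mpg is an example data set bundled with ggplot2 (an assumption, not from the slides).
# With only the data supplied, ggplot() draws an empty panel:
# no variables are mapped yet and no layers have been added.
base_plot <- ggplot(data = mpg)
base_plot
```

The result is a blank canvas; the remaining ingredients fill it in.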
aes() maps variables from a data set to various elements of a plot
Discrete values (groups / categories) can have color, shape, linetype, or fill mappings.
Points can additionally have x and y position mappings.
Mappings go into the aes() function as the 2nd argument in ggplot().
Any part of the plot that depends on a variable in the data goes in aes(); fixed settings (for example, making every point the same color) go outside it.
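The mapping rules above might look like this in practice. This is a sketch only: the mpg data set and the columns displ, hwy, and class are assumed for illustration.

```r
library(ggplot2)

# displ (engine size) and hwy (highway mileage) are continuous columns of mpg;
# class is a discrete column. All three are mapped to plot elements inside aes(),
# which is passed as the second argument (mapping) of ggplot().
mapped_plot <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = class))

# The axes now appear, but nothing is drawn inside them until a geom is added.
mapped_plot
```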
geoms are the geometric objects drawn in your plot.
Common geoms include:
geom_boxplot()
geom_histogram()
geom_line()
geom_density()
geom_bar()
geom_point()
ggplot() is built in layers
Use the + operator to add layers to the existing ggplot() object.
In this way, your code is explicit about which layers are added and in what order.
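As a rough illustration of the + operator, again assuming the mpg example data set, all three ingredients (data, aes() mappings, and a geom) combine into a scatter plot:

```r
library(ggplot2)

# Start from the data and mappings, then add one geom layer with +.
# The + sits at the end of the line so that R knows the expression continues.
scatter <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()
scatter
```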
To build your plots layer by layer, you chain geoms together with +:
PSA:
Adding layer by layer:
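One possible way to see the layers accumulate is to add them one statement at a time. The economics data set (also bundled with ggplot2) and the choice of geoms are assumptions made for this sketch:

```r
library(ggplot2)

# economics is an example data set bundled with ggplot2 (assumed for illustration).
layered <- ggplot(economics, aes(x = date, y = unemploy))  # 1. data + mappings
layered <- layered + geom_line()                           # 2. first layer: a line
layered <- layered + geom_point(size = 0.5)                # 3. second layer: points on top
layered
```

Layers are drawn in the order they are added, so the points here sit on top of the line.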
You do not need to create an object for the plot:
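For instance, a histogram can be drawn directly, without storing it anywhere (a sketch assuming the mpg example data set and an arbitrary choice of 30 bins):

```r
library(ggplot2)

# The plot is drawn as soon as the expression is evaluated at the console.
ggplot(mpg, aes(x = hwy)) +
  geom_histogram(bins = 30)
```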
BUT you can assign your plot to a variable…
…and then print / view your plot
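A small sketch of the assign-then-print pattern, with the variable name hwy_density and the mpg data set chosen only for illustration:

```r
library(ggplot2)

# Assigning stores the plot object; nothing is drawn yet.
hwy_density <- ggplot(mpg, aes(x = hwy)) +
  geom_density()

# Printing draws it. Typing the name alone at the console does the same;
# an explicit print() is needed inside functions and loops.
print(hwy_density)
```

Storing the plot also makes it easy to add further layers later with +.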
Visualize the distribution of a continuous variable by plotting its five-number summary:
One continuous variable and one discrete variable
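A boxplot of this kind could be sketched as follows, assuming the mpg example data set with class as the discrete variable and hwy as the continuous one:

```r
library(ggplot2)

# x = discrete variable (vehicle class), y = continuous variable (highway mileage).
# Each box shows the median and the quartiles. By default the whiskers reach to the
# most extreme values within 1.5 * IQR of the box, and anything beyond is drawn
# as an individual point.
box <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot()
box
```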
Discrete variables can also be used to differentiate plot elements by including them in the aes() function
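As an illustration, a second discrete variable can be mapped to fill so that each group gets its own box color. The mpg data set and the drv column (drive type) are assumptions for this sketch:

```r
library(ggplot2)

# drv is a discrete column of mpg; mapping it to fill inside aes()
# splits each class into one colored box per drive type and adds a legend.
box_by_drv <- ggplot(mpg, aes(x = class, y = hwy, fill = drv)) +
  geom_boxplot()
box_by_drv
```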